Learning representative temporal features for action recognition
Abstract
In this paper, a novel video classification method is presented that aims to recognize different categories of third-person videos efficiently. Our motivation is to achieve a light model that could be trained with insufficient training data. With this intuition, the processing of the 3-dimensional video input is broken into 1D processing in the temporal dimension on top of 2D processing in the spatial dimensions. The processes related to the 2D spatial frames are done by utilizing pre-trained networks with no training phase. The only step which involves training is to classify the 1D time series resulting from the description of the 2D signals. As a matter of fact, optical flow images are first calculated from consecutive frames and described by pre-trained CNN networks. Their dimension is then reduced using PCA. By stacking the description vectors beside each other, a multi-channel time series is created for each video. Each channel of the time series represents a specific feature and follows it over time. The main focus of the proposed method is to classify the obtained time series effectively. Towards this, the idea is to let the machine learn temporal features. This is done by training a multi-channel one-dimensional Convolutional Neural Network (1D-CNN). The 1D-CNN learns the features along the temporal dimension only. Hence, the number of training parameters decreases significantly, which would result in the trainability of the method on even smaller datasets. It is illustrated that the proposed method could reach state-of-the-art results on two public datasets, UCF11 and jHMDB, and competitive results on HMDB51.
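The last two stages described in the abstract (stacking per-frame descriptors into a multi-channel time series, then convolving along time) can be sketched roughly as follows. This is only an illustration and not the authors' implementation: the optical flow, pre-trained CNN description, and PCA steps are replaced by random numbers, the layer is untrained, and every name and size below (`stack_descriptors`, `conv1d_multichannel`, `conv1d_param_count`, 10 frames, 4 channels, 2 filters, kernel width 3) is a hypothetical choice made for the example.

```python
import random


def stack_descriptors(frame_descriptors):
    """Turn T per-frame vectors of size C into C channels of length T."""
    return [list(channel) for channel in zip(*frame_descriptors)]


def conv1d_multichannel(series, kernels, biases):
    """Valid 1D convolution along time, followed by ReLU.

    series:  C channels, each a list of T floats
    kernels: F filters, each holding C lists of K weights
    biases:  F floats
    returns: F output channels, each of length T - K + 1
    """
    num_channels = len(series)
    length = len(series[0])
    outputs = []
    for kernel, bias in zip(kernels, biases):
        width = len(kernel[0])
        channel_out = []
        for t in range(length - width + 1):
            acc = bias
            for c in range(num_channels):
                for j in range(width):
                    acc += kernel[c][j] * series[c][t + j]
            channel_out.append(max(acc, 0.0))  # ReLU non-linearity
        outputs.append(channel_out)
    return outputs


def conv1d_param_count(in_channels, filters, kernel_size):
    """Trainable weights plus one bias per filter in a 1D conv layer."""
    return filters * (in_channels * kernel_size + 1)


# Assumed toy sizes: 10 frames, 4 reduced features per frame, 2 filters.
random.seed(0)
T, C, F, K = 10, 4, 2, 3

# Stand-in for the PCA-reduced CNN descriptions of optical flow images.
frame_descriptors = [[random.gauss(0.0, 1.0) for _ in range(C)] for _ in range(T)]
series = stack_descriptors(frame_descriptors)

# Randomly initialised (untrained) filters, one weight row per channel.
kernels = [[[random.gauss(0.0, 0.1) for _ in range(K)] for _ in range(C)]
           for _ in range(F)]
biases = [0.0] * F

features = conv1d_multichannel(series, kernels, biases)
print("channels x time:", len(series), "x", len(series[0]))
print("filters x time: ", len(features), "x", len(features[0]))
print("layer parameters:", conv1d_param_count(C, F, K))
```

Because each filter spans only the channel and time axes, the parameter count of such a layer depends on the number of channels, filters, and the kernel width, not on the spatial resolution of the frames.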
Similar resources
Learning Representative Temporal Features for Action Recognition
In this paper we present a novel video classification methodology that aims to recognize different categories of third-person videos efficiently. The idea is to track motion in videos and extract both short-term and long-term features from motion time series by training a multichannel one-dimensional Convolutional Neural Network (1DCNN). The positive point about our method is that we only...
Evaluation of Local Spatio-temporal Features for Action Recognition
Local space-time features have recently become a popular video representation for action recognition. Several methods for feature localization and description have been proposed in the literature and promising recognition results were demonstrated for a number of action classes. The comparison of existing methods, however, is often limited given the different experimental settings used. The pur...
Learning Spatio-Temporal Features for Action Recognition with Modified Hidden Conditional Random Field
Previous work on human action analysis mainly focuses on designing hand-crafted local features and combining their context information. In this paper, we propose using supervised feature learning as a way to learn spatio-temporal features. More specifically, a modified hidden conditional random field is applied to learn two high-level features conditioned on a certain action label. Among them, ...
Appendix Learning Hierarchical Invariant Spatio-Temporal Features for Action Recognition with Independent Subspace Analysis
For all the datasets, we use the same pipeline as Wang et al. [5]. This pipeline first extracts features (or in the case of Wang et al. [5], descriptors such as HOG3D) from videos on a dense grid in which cube samples overlap 50% in x, y and t dimensions. K-means vector quantization is applied on the extracted features and each video is histogrammed to form a bag-of-words representation. Finally...
Learning discriminative features for fast frame-based action recognition
In this paper we present an instant action recognition method, which is able to recognize an action in real-time from only two continuous video frames. For the sake of instantaneity, we employ two types of computationally efficient but perceptually important features – optical flow and edges – to capture motion and shape characteristics of actions. It is known that the two types of features can...
Journal
Journal title: Multimedia Tools and Applications
Year: 2021
ISSN: 1380-7501, 1573-7721
DOI: https://doi.org/10.1007/s11042-021-11022-8